arXiv:1509.07875v2 [astro-ph.CO] 7 Mar 2016 


Prepared for submission to JCAP 


Towards physics responsible for 
large-scale Lyman-o: forest bias 
parameters 

Agnieszka M. Cieplak, Anze Slosar 

Brookhaven National Laboratory, 

Bldg 510A, Upton NY 11973, USA 

E-mail: acieplak@bnl.gov, anze@bnl.gov 

Abstract. Using a series of carefully constructed numerical experiments based on hydrody¬ 
namic cosmological SPH simulations, we attempt to build an intuition for the relevant physics 
behind the large scale density {bs) and velocity gradient (6,,) biases of the Lyman-a forest. 
Starting with the fluctuating Gunn-Peterson approximation applied to the smoothed total 
density field in real-space, and progressing through redshift-space with no thermal broaden¬ 
ing, redshift-space with thermal broadening and hydrodynamicaly simulated baryon fields, 
we investigate how approximations found in the literature fare. We find that Seljak’s 2012 
analytical formulae for these bias parameters work surprisingly well in the limit of no thermal 
broadening and linear redshift-space distortions. We also show that his brj formula is exact 
in the limit of no thermal broadening. Since introduction of thermal broadening significantly 
affects its value, we speculate that a combination of large-scale measurements of br^ and the 
small scale flux PDF might be a sensitive probe of the thermal state of the IGM. We find 
that large-scale biases derived from the smoothed total matter field are within 10-20% to 
those based on hydrodynamical quantities, in line with other measurements in the literature. 
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1 Introduction 

Measuring the growth of large scale structure has become a powerful probe of testing cosmo¬ 
logical parameters. The Lyman-a forest has emerged as one of the primary tracers of this 
large scale structure at intermediate redshifts. Seen as absorption features in quasar spectra, 
the forest probes the distribution of neutral hydrogen between the observer and the quasar. 
The Lyman-a absorption of neutral hydrogen has a particularly large cross section, and as 
the emission continuum of a quasar becomes redshifted to this transition wavelength, it will 
be absorbed even with small fractions of neutral hydrogen present. This method is therefore 
sensitive to low gas densities, serving as a tracer of the large scale matter distribution between 
the redshifts of 1.7 (below which the gas becomes fully ionised while at the same time the UV 
light continuum becomes redshifted to wavelengths absorbed by the atmosphere) and red¬ 
shifts 4-5 (above which the forest becomes opaque). At small scales, the Lyman-a forest has 
given us unique constraints on, for example, warm dark matter models [1, 2] and primordial 
black hole dark matter [3], whereas on intermediate scales it has probed limits on the sum 
of the neutrino masses [4-6], the running of the spectral index [6], or dark matter - baryon 
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scattering [7]. Recently, at large scales, it has provided a detection of the baryon acoustic 
oscillations at intermediate redshifts [8-10]. With the completion of the BOSS survey [11], 
and the start of eBOSS [12], as well as the planned DESI survey [13], future Lyman-a forest 
measurements will be covering larger scales with greater precision. It is therefore crucial that 
we also improve our understanding of the biases involved between what is observed and the 
underlying matter distribution. 

The relationship between the measured Lyman-a forest flux and the underlying matter 
is highly nonlinear, but the physics is thought to be well-understood. Underlying neutral 
hydrogen gas is related to this measured flux via the optical depth, r = —InF, which is pro¬ 
portional to the Lyman-a absorption cross section and the number of neutral hydrogen atoms 
along the line of sight (los). If photoionization equilibrium at these redshifts is assumed, the 
amount of neutral hydrogen is related in a temperature-dependent way to the total number of 
hydrogen atoms, which in turn traces the underlying matter density. Additionally, assuming 
adiabatic expansion, which produces a tight temperature-density relation, 7 — 1 = dlnp/dlnT 
[14], the optical depth can be modeled as r = A{1 + 5)", where 5 is the gas overdensity, A 
is a constant which depends on the photoionization rate, the temperature of the gas, and 
redshift, while a = 2 — 0 . 7(7 ~ !)• The gas overdensity traces the total matter overdensity 
down to the Jeans scale, where gas pressure supports against gravitational collapse. 

This Fluctuating Gunn Peterson Approximation (FGPA) [15] , although it ignores shock 
heated gas and thermal broadening, has been surprisingly successful in explaining statistical 
properties of the Lyman-a forest observations. Because it is a local transformation, the flux 
fluctuations, defined as, 6 f = F/F — 1 (with F being the mean absorbed flux) will trace the 
dynamically dominant helds on large scales [16, 17]: 

6f = hs5+ hr,r]+ hY'5r + e. ( 1 . 1 ) 

This equation is valid in Fourier-space in the limit /c —)• 0 or on a sufficiently smoothed 
real-space field (i.e where fluctuations are smoothed so that the Taylor expansion in 6 is 
valid). The quantities J, 5-^ and Jr are relative fluctuations in the total density, mass- 
weighted velocity gradient rj = —H~^dv\\/dr^^ and photo-ionization fluctuations rate T. The 
bias parameters bs, brj and br encode properties of the small-scale physics that determines 
their values on large-scales. In this paper we will use these symbols to denote absorbed 
flux fluctuation {6f) biases, but biases for other helds will denote tracer held, e.g. br ,5 for 
the density bias of the 5-r held (all tracer helds will be normalized to the mean). Finally, e 
describes the scatter between the tracer held 8f and the prediction from source helds. It is 
expected to have a white power P/v on large scales. We will not consider photoionization 
huctuations in this paper, but see e.g. [18, 19] and references there-in. 

Ignoring the T held, it then follows that the three-dimensional power-spectrum of hux 
on large scales is given by 

Ps^{k) = bl{l + l5iJi^ f Ps{k) + PN (1.2) 

where (5 = fb^i/bs, with / being the logarithmic growth rate, ^ is the cosine of the angle 
between the los and the vector k, and Ps is the total matter power spectrum. This is 
equivalent to the famous Kaiser [20] redshift-space distortion (RSD) formula for galaxies, 
but note that in our case fi is not uniquely determined by bs, as we will discuss later in the 
paper. The noise power Pjv is expected to be small and is often ignored. 

Parameters bs and brj (or equivalently f3) can be thought of as purely phenomenological 
parameters to be marginalised over as is done in e.g. BAG analyses. On the other hand they 
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encode the same small scale physics that is used when predicting small scale ID power spectra 
that has been used to constrain cosmology. Therefore, with a goal to eventually be able to 
perform a unified analysis of correlations in the Lyman-a forest, from smallest to the largest 
available scales, rather then artificially splitting the problem into a small-scale ID power 
spectrum and a large-scale correlation function, the underlying physics that determines the 
values of these parameters has to be well-understood. 

This paper attempts to bring some understanding to what determines the numerical 
value of these large-scale biases and how well these can be described using analytical predic¬ 
tions. The main focus in this study is not on using these predictions for precision cosmology, 
but primarily to first understand how accurate these predictions are or when do they fail 
in describing results of A^-body simulations. This expands and complements previous work 
trying to understand the large scale fluctuations in the Lyman-a forest [16, 17, 21-25]. 

2 Theory 

The basic starting point for derivation of biases is the notion of peak-background split (PBS)^, 
namely that a large over-dense region of the universe is equivalent to a slightly over-dense 
universe and that a large region of universe with non-zero r] is equivalent to an anisotropic 
universe with a velocity gradient. This picture is exact in the A: —>■ 0 limit and gives the 
following simple prescription for the calculation of bias parameters[16, 17, 26]: 

h = if l,=o (2.1) 

These formulae can be used both in theoretical modeling and in simulations. 

This approach was used by [26] to make a first analytical attempt at understanding the 
large scale bias and redshift space distortion parameter. By using the second order pertur¬ 
bation theory gravitational coupling between short and long wavelength modes, analytical 
formulae can be obtained that link bs and br^ to the observed flux probability distribution 
function (PDF). 

Approximations used in that paper were 

• The second order standard perturbation theory gives sufficiently accurate predictions 
bias of density field to n-th power 

• The flux field is fully determined by the FGPA formula 

F = exp [(-A(l + ,5)“] (2.3) 

• No thermal broadening of the lines 

Somewhat surprisingly, we will later discover that the inaccuracy of these equations is mostly 
due to thermal broadening and the non-linearity of the velocity field. 

^The etymology of the wording “peak-background” split comes from thinking about the evolution of dark 
matter halos as separate from the evolution of the large-scale overdensity - for all practical purposes the halo 
cannot see the difference between a different background cosmology and a sufficiently large-wavelength mode. 
The same idea often goes under the name “separate universe” method. Of course, in our case, there are no 
halos, but we resign to use this widely adopted phrase to refer to the fact that small-scale non-linear evolution 
cannot distinguish between a large-scale linear mode and a modified background. 
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2.1 Density bias 

We will proceed to review the derivation of bias parameters. Our treatment is somewhat 
different from that of [26] in that it makes the peak-background argument more obvious, 
but we arrive at identical results. In the presence of a long wavelength density mode, 5i, 
in an Einstein-deSitter universe, the short wavelength density modes, in second order 
perturbation theory become 

6r^ = {l + V25i)5s^ (2.4) 

where V 2 = 34/21, is the angular average of the second order perturbation theory kernel 
F 2 {ki,k 2 ). This relation does not require that 5s modes are linear, it simply provides an 
approximate mapping between a set of non-linear modes in a patch of universe as a function 
of large-scale over-density. 

Therefore, the total density transforms as 

5 + 5i = {l + V25 i)5s + 5i. (2.5) 


The mean for a quantity X in the presence of large-scale mode 5i are thus given by 


W = 



X{5 = <5(1 V25i) + 5i)p{5)d5, 


( 2 . 6 ) 


where p{5) is the probability distribution function for 5. Using Equation (2.1), this allows 
us to calculate bias for any quantity which is uniquely determined by 5 and connect it to its 
PDF. Applying this to matter fields to the n-th power (d”) (and normalizing by the mean of 
this tracer field), we hnd 

bsn^s = niy2 + n(5^~^}/(5^}. (2.7) 


The large scale bias for the Lyman-a forest optical depth with respect to the underlying 
density is given by 


h —1 —1 

d5i 


V2 


dr 


_ 1 / dr 


(2.8) 


For the simple isothermal density-temperature relation, where a = 2, this gives 




2(1 -k V2cr]) 

1 -I- CT j 


(2.9) 


where ctj is the rms density field smoothed on a Jeans scale. 

The bias for optical depth is a theoretically interesting quantity, but it is not what we 
need. Ignoring redshift space distortions in the optical depth for now, the flux bias relating 
the flux to the underlying density field, can then be calculated as: 


bs 


1 5F[r(J)] 
F dSi 


1 

F 



( 2 . 10 ) 


Using Equation 2.3, the above bias can be expressed in terms of the constants A and a: 


bs = F-^[a{FlnF) + a{v 2 - 1) (FlnF[l - (-InF/A)-" 'j^j. (2.11) 

Remarkably, therefore, this theoretical bias could be calculated from just the constants 
A and a, which could be derived from the temperature-density scatter plots from simulations, 
and the observed flux PDF in the data. 
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2.2 Velocity gradient bias 

For the velocity gradient bias, consider the effect of r/ on a large patch of the universe. The 
redshift-space coordinate s = r + transforms as 

s —>■ s — rr/. (2-12) 

Action of the rj field is thus equivalent to stretching the redshift-space in the radial direction. 
Since stretching cannot change the total number of hydrogen atoms, the mean value of r 
must change by (1 — giving br^r] = 1- The same argument holds for all other fields that 
are conserved on transformation from real to redshift space (e.g. number of galaxies^). In 
the limit of no thermal broadening, the r simply responds as pure rescaling 


T(r) ^ ^ 

— ^ —I = r (r(l - T]) M (1 -h T/) -b 0{rj^) 

— T] 

(2.13) 

Using Equation 2.2, 

F = [ p{F)FdF 

Jo 

(2.14) 

and T = — InT, one can derive 



br, = {FlnF ). 

(2.15) 


This equation was derived by Seljak by starting with the relation r(s) = r(r)(l + /^^(5) 
and then Taylor expanding the expression for F. The above shows that it is in fact much 
more general and gives the correct answer on large scales even if the distortions are not 
Kaiser-like in r on intermediate scales, as long as velocity dispersion is negligible. However, 
as we will see later, the presence of thermal broadening is significant and breaks this formula, 
since the broadening kernel does not stretch with the {1 — rj) factor. 

In this paper we use SPH simulations to test these analytical formulae. We start with 
the smoothed total matter field part of these simulations and apply the FPGA equation 
exactly in real space to see how well they fare in this idealised toy-case. We then proceed to 
progressively more realistic cases, ending with the analysis of the full hydrodynamic part of 
the simulation. 

3 Description of Numerical Methods and Simulations Used 

The simulations used for this comparison study are Gadget-3 [27] hydrodynamic simulations 
with a box size of L = 40 Mpc/h with N = 2 x 1024^ for the number of gas and dark matter 
particles. They are evolved with a Haardt and Madau UV background [28] and the simple 
QUIGKLYA option for star formation with no feedback, where any SPH particle that reaches 
a maximum density is converted into a star particle. The standard critical over-density, as 
used in these simulations, is 1000 times the mean. The SPH smoothing length is such that 
the particle’s density is estimated by smoothing over 32 neighboring particles. The fiducial 
simulation uses the WMAP7 cosmology. 

^As an interesting aside: One could imagine that the mean value of a field would respond non-trivially 
to a global anisotropic expansion even in real-space, which would then lead to 6 r, 7 ^ 1 even for tracers that 
are conserved. But one can only imagine it. If this was the case, then, by symmetry the bias factors for 
rix = dvx/dx and rjy = dvy/dy should be the same (assuming x and y coordinates are transversal to the radial 
2 coordinate). Since r^y -\- oc 5 on the large, linear scales, it is clear that there can be no change in the 
mean field on average due to a change in y at fixed 5 = 0. There will be, however, second-order effects. 
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b{z = 2.5) 

-0.015 

0.0 

0.015 

~ 0) 

0.260 

0.275 

0.290 

II 

0.024 

0.0 

-0.024 

o' 

II 

0 

0.715 

0.725 

0.735 

m 

0.707 

0.702 

0.697 

0-8 

0.795 

0.816 

0.836 


Table 1. Parameters of peak-background split simulations. 


In the following, we start with simply analyzing the density field from these hydrody¬ 
namic simulations, using the total matter overdensity output <5 from the simulation at each 
point to transform to a flux field via the FGPA in Equation 2.3. We progressively add the 
redshift space distortions (RSD), and thermal broadening by hand to this flux field as de¬ 
scribed below in Section 4. Before applying the FGPA to the density field, we smooth it 
using a Gaussian hlter 6{k) —)• 5(A;)exp(—/c^R^/2) to account for Jeans smoothing present 
in baryon fields. We always measure our quantities at different levels of smoothing, in¬ 
cluding unrealistically large smoothing factors, since analytical expressions are expected to 
perform better at more aggressive levels of smoothing. Unless stated otherwise, we assume 
A = 0.3((1 -|- z)/{\ + 2.4))'^'^ and a = 1.6 as in [26]. Finally, it is only in Section 6 , that 
we analyze the flux field from the hydrodynamic part of these same simulations, in order to 
compare it to the one generated from this FGPA prescription. 

We calculate true bias parameters using two methods, which we describe in the following 
two sub-sections. 

3.1 Peak-background split applied to simulations 

This amounts to numerically evaluating expressions in Equations 2.1 and 2.2 and has been 
used to measure the bias parameter by McDonald well over a decade ago [16]. 

For density simulations, this entails running two additional simulations, acting as over- 
and underdense counterparts to the fiducial simulation. In order to achieve this, the starting 
cosmological parameters for these simulations have to be changed. The main requirement is 
for the density in the perturbed universe to satisfy p'{t) = p{t){l + Si(t)), where <5/ represents 
the matter overdensity with respect to the background universe (which can be thought of as 
the long wavelength mode overdensity in this region of the universe). From this relation, the 
cosmologies for the perturbed simulations can be derived. Requiring a 6i value of 0.015 at 
z = 2.5 (the central redshift of interest for the Lyman-a forest ), the perturbed cosmological 
parameters in Appendix A can be evaluated at z = 0 to give the values in Table 1. All three 
simulations listed in Table 1 are evolved from Zi = 159 with outputs at 2 : = 3, 2.8, 2.75, 2.6, 
2.5, 2.4, 2.25, 2 . 2 , and 2 . Here we focus our analysis on redshifts 3, 2.5, and 2 . Since the 
evolution of redshift with time is different in the perturbed universe we rescale the density 
field of the perturbed simulation from the nearest redshift output z to match the required z' 
as seen in Equation A.4 using the growth factor. (The matching 2 ' for example to z = 2.5 
is z' = 2.47 for bi = -1-0.015). We check that this scaling matches the results achieved by 
interpolation methods to better than 0 . 01 %. 

After the perturbed simulation density fields are rescaled to the correct redshift, the 
over-densities are further rescaled to be expressed with respect to the the mean of the unper¬ 
turbed simulation by —>■ (5 p®''*( 1 -|- J/) -|- J;- This Jp®''* is then used to calculate the mean 
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flux in the FGPA approximation when computing the numerical derivative in Equations 2.1 
and 2.2 for the peak background split method analysis. 

For the velocity gradient bias, the calculation is less painful. The prescription in Equa¬ 
tion 2.12 postulates that we need to imagine our simulation box is now of the size L‘^xL{l—r]i), 
where we set r]i = ±0.01 to be the long-scale r] we differentiate with respect to. This means 
that before calculating the mean quantities, we need to i) divide all r’s by (1 — rji) and ii) 
divide the thermal broadening parameter by (1 — r/;). These are precisely the prescriptions 
that McDonald gives in his 2003 paper [16]. 


3.2 Direct mode-by-mode measurement 

As a cross-check, we also calculate the bias parameters using the standard way by comparing 
the cross matter-trace power spectrum with the matter power spectrum. We formalize this 
by writing the likelihood assuming scatter e to be normally distributed with variance Pn, i.e. 


logL = const. — 


N 

~2 


logTW 


E 

modes i 




6FiS,r],bs,br,))^ 

2Pn 


(3.1) 


where 6F,i are the actual measured flux modes and 5F{6,r],bs,brj) = bs6 ± br^rj is the corre¬ 
sponding prediction from actual total fluctuation modes <5j and bias parameters. We maximize 
this likelihood over free parameters bs, /3, Pj\f and their k'^ and fey corrections (6 parameters 
in total) and use the second derivative matrix to derive the marginalised error assuming 
Gaussianity. 

By analytically maximizing the likelihood, one can show that this measurement is es¬ 
sentially the same as measuring the bias from the ratio of the cross-power to the dark matter 
auto-power spectrum. Measuring bias from the cross-power spectrum rather than the flux 
auto-power spectrum avoids biases due to the presence of the noise term Pn, but this is 
likely a very small effect. The main advantage of going through likelihood is that this 
method correctly propagates the scatter between the modes into an uncertainty in the bias 
determination. 

We have tested this method extensively on toy problems. Nevertheless, to our great 
consternation, we note that while results are in general in good agreement with those from 
the peak-background split, there are cases where they disagree significantly. We believe that 
these are due to two kinds of effects. First, our box size of 40 Mpc/h is limited and we do 
not correctly account for the effect of large-scale modes. This plagues the PB split method 
as well, but there at least the effect of /c = 0 mode is propagated correctly. In fact the 
fundamental mode, k = 27r/(40 Mpc/h), is on the verge of being significantly non-linear at 
z = 2.5. Second, although we correctly account for the k"^ dependence of bias parameters, 
we do not fit for other bias parameters that surely exist at this level, most importantly 62 
and bg 2 [29] along with similar parameters for the r] field. 

In [17], the authors take a somewhat different approach towards fitting bias parameters 
and fit the power spectrum measurement with an attempted guess at the shape of non-linear 
corrections to the A: —0 limit. Despite having a larger box they similarly hnd that solutions 
are sensitive to details of the fitting formulae. 


4 Results 

We now show our results through a series of plots that use the same symbol coding. Results 
from analytical predictions are plotted as lines, peak-background split results as dots and 
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Figure 1. Upper left panel shows the mean value of moments of the matter field 5" at redshift 2.5 
as a function of smoothing scale. Other panels show their bias factors: from peak-background split 
(filled dots) and the analytical prediction (black line). 


mode measurements as triangles with error-bars. These are usually plotted as a function of 
smoothing scale, unless noted otherwise. 

4.1 Moments of the density field 

In Figure 1, we plot the theoretical bias for moments of the density field in our simulations 
together with predictions from Equation 2.7 for n = 2, 3, and 4. We see that even though 
(5^ becomes non-linear at i? > 2 Mpc//i, the bias predictions work surprisingly well down to 
R ~ 0.3 Mpc/h. In fact, as long as the variance of the field is not-significantly larger than 
unity, the predictions for d”' bias are accurate, despite (d”’) being large, presumably due to 
isolated high-density regions. 


4.2 Taylor expansion approximation to the r field 

We start by a simple test, that is demonstrating the hopelessness of using a Taylor expansion 
in modeling the flux fluctuations. Namely, we can expand flux as 


F = exp (—r (1 + Sr)) = exp (—r) 


1-T6r + - ■■■ 


(4.1) 
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Figure 2. Density bias of the flux fluctuations after calculating the flux field as a Taylor expansion 
in optical depth around its mean (see text). The black line represents the full flux calculation (no 
Taylor expansion). The green line represents the Taylor expansion to 2nd order, the cyan represents 
4th order, and magenta represents a 6th order expansion. All flux calculations are done for real-space 
FGPA with A = 0.34 and a = 1.6 for z = 2.5 using the peak-background split method described in 
the text. 

and then directly test the applicability of this expansion in simulations, by using a truncated 
Taylor expansion in place of an exponential when creating the flux field in simulations, but 
then proceeding to calculate biases as usual. Discrepancies between the truncated exponential 
expansion and the true field will show what is the best any analytically method based on 
Taylor-series expansion could do. 

Results for the simple real-space FGPA bias are shown in Figure 2. We see that the 
expansion works fine at large smoothings, but fails to describe reality at small values of 
smoothing kernel size. This demonstrates that any approximations that attempt to describe 
the Lyman-a forest bias in terms of bispestrum, trispectrum, etc of the r held are bound to 
fail. What really drives the change of the bias is the change in the flux PDF in the presence 
of the large-scale mode. 

4.3 Real-space FGPA 

In Figure 3 we plot the corresponding predictions for br ,5 and bs for real-space r. Analytical 
predictions agree remarkably well with measurements for this deterministic relation and are 
in fact much better than one would naively expect given Figure 1. We And that the agreement 
for flux is better than the agreement for the optical depth held. This is likely because the 
exp(—r) transformation suppresses the high-density regions where linear theory is supposed 
to break down. 

The agreement between the theory and the measurement hinges on the transformation 
of the flux PDF on the modes as given by the second order linear calculation in Equation 
(2.5), which one would naively expect to be swamped by higher-order non-linear effects on 
small scales. This good agreement is not a coincidence. If we set 1^2 = 0 (dashed line in 
Figure 3) we get completely wrong results, so the magic number U 2 = 34/21 stemming from 
second order standard perturbation theory is indeed crucial to the good fit we obtain in 
this case. As demonstrated in section 4.2, this success does not stem from the second order 
perturbation theory correctly capturing the bispectrum of r fluctuations, but rather, it is 
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Figure 3. Measured and predicted values of &t,i 5 (left panel) and bs (right panel) in real space as 
a function of smoothing scale for z = 2 (magenta, top), z = 2.5 (black, middle) and z = 3 (cyan, 
bottom). Solid lines are analytical formula predictions, solid points are PBS results and triangles 
mode fitting. Dashed lines show analytic predictions with 1/2 = 0. 


because Equation (2.5) correctly captures how the flux PDF is transformed in the presence 
of the large-scale mode. 

We also see that the analytical prediction works increasingly well with decreasing red- 
shift, in correspondence with the decreasing value of A. With no smoothing (R = 0), the 
analytical prediction is within 4% of the peak-background split results for z = 2, where 
A = 0.17, within 6% for z = 2.5, where A = 0.34, and within 18% for z = 3, where A = 0.62. 
The analytical prediction therefore works best for low values of the optical depth, which de¬ 
creases with decreasing redshift, despite the increase in the matter power spectrum. On the 
other hand, if A is held artificially constant between redshifts at A = 0.34, the corresponding 
values are 16% for z = 2 and 7% for z = 3, as expected, for linear theory to hold better at 
higher redshift. If the density field is smoothed with a smoothing factor of ii = 0.2 Mpc/h, 
thus decreasing the overall value of the optical depth, the analytical prediction is still within 
4% of the peak-background split results for z = 2, where A = 0.17, but now 0.6% for z = 2.5, 
where A = 0.34, and within 0.8% for z = 3, where A = 0.62. 

The slightly worse agreement between the numerical and analytical methods at higher 
redshift is therefore driven by the increasing value of the normalization, A. The IGM is less 
ionized at these redshifts, resulting in a higher optical depth. The higher A therefore may 
pick out the few extreme regions of the simulation, skewing the bias measurement towards 
these regions. Since the sample size of these regions inside the volume being probed by our 
simulations is small, the error on this measurement will be higher at larger redshift, limited 
by this sample variance. 

4.4 Redshift-space FGPA 

We introduce the redshift-space distortions in three different ways of increasing realism to 
test how well do theory predictions fare. 
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4.4.1 Linear Kaiser RSD 

First, we impose the redshift-space distortions by simply adding them to the real-space r 
FGPA field as expected in the linear theory: 

Ts = Tr{l + flJ.'^Si) (4.2) 

This is justified by the fact that the r field is conserved on real-redshift-space transformation 
and hence br^rj = 1- We refer to fields created in such a way as “linear Kaiser RSD” (IKRSD).^ 

4.4.2 Linear velocity RSD 

Next we produce an actual redshift-space r by shifting the r field around using appropriate 
velocities, following 

Ts{s) = j T{r)6D{s — r — H~^v)dr, (4.3) 

but actually using linear velocities Vk = ifaH{a)6{k)fj.k/k for this. We refer to this method 
of generating RSD as “linear velocity RSD” (IvRSD). 

4.4.3 Non-linear velocity RSD 

Finally, we use equation 4.3, but use actual non-linear baryon velocities. We refer to this 
method of generating RSD as “non-linear velocity RSD” (nlvRSD). 


4.4.4 Thermal Broadening 

In addition to the velocity offset in a simple RSD picture, the absorbing neutral hydrogen gas 
also has some thermal motion with respect to the background due having a gas temperature 
T. This additionally introduces a Gaussian line broadening profile: 






(4.4) 


where Ua is the new cross section written in terms of the original Lyman-a absorption cross 
section at rest, do, broadened by a Maxwellian velocity distribution of the gas with a Doppler 
parameter b = ^^2kT/m, where m is the mass of the atom, and An is the velocity difference 
with respect to the center of the absorption line. 

In effect, RSD and thermal broadening together move the optical depth by a corre¬ 
sponding peculiar velocity and spread out the value at that point with a Gaussian profile to 
neighboring pixels with a width based on the temperature of the gas at the original point. 
We therefore expect thermal broadening to be degenerate with the brj value of the pure RSD, 
as these are both line-of-sight effects. 

We use the tight temperature-density relation to introduce this thermal broadening, 
with b‘^ = (12.8kms“^)^ ( kKr- ) 7 — 1 = (2 — a)/0.7 and Tq = lO^iF. 


4.4.5 Results 

For the peak background split model calculations, we rescale and smooth the peculiar veloc¬ 
ities just as the RSD density in the previous section (with an additional factor of /). We use 
this rescaled velocity field for the peak-background split calcnlation for the simple redshift 
space helds. 

^ Note that one cannot introduce redshift space distortions using = St,. + since this drops the 

cross terms, which are not negligible, and produce widely disparate results for smaller smoothing scales than 
the formula above. 
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Figure 4. Flux bias parameter for z = 2.5 and A = 0.34 and a = 1.6, while varying the smoothing 
parameter B. Black represents real-space, red redshift space, and blue introducing redshift space and 
thermal smoothing, as described in the text. Lines, points and errorbars correspond to theory, PBS 
method and likelihood methods of bias determination. The top left panel corresponds to IKRSD, top 
right to IvRSD and bottom left to nlvRSD (see text for description of acronyms). The bottom right 
panel shows the difference between the analytical predictions and the PBS method. The dotted lines 
represent IKRSD, dashed IvRSD and solid nlvRSD. 


Our results for the flux density bias are displayed in Figure 4 in red. We see that as we 
progress towards more realistic redshift-space distortions, the theory predictions for the den¬ 
sity bias remain in rough qualitative agreement with the measurements, but that the formula 
is accurate only to within an order of magnitude. Looking first at the PBS results, we note 
that while purely Kaiser distortions (IkRSD) seem to keep the predictivity of our formulae, 
even taking into account the non-linearities induced by actually moving the r field around 
(instead of imposing fj,^ distortions) already introduces errors at the level of 20%, which then 
increase even further when we go to correctly non-linear velocities. Adding thermal broaden¬ 
ing correctly captures the decrease in bias at small smoothing, but underpredicts the overall 
effect. When thermal broadening is applied on nlvRSD, the formula is accurate to around 
10% for R = (10“^ — 10“^)Mpc//i, but this seems to be mostly due to fortuitous cancellation 
of errors (since it works worse in the case of IvRSD). We also note a surprising discrepancy 
between the likelihood method for measuring bias and the PBS in the case of nlvRSD and no 
thermal broadening. This can be directly tracked to the fact that there are missing velocity 
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Figure 5. Same as Figure 4 but for 6^. 

non-linearities in the 40 Mpc/h box that cause the offset in PBS measurements with respect 
to the theory predictions - if we replace the velocity field in the offset simulations with that 
of the original simulation, the difference between likelihood and PBS biases disappear. 

Figure 5 shows the same for the velocity gradient bias. In this case, in case of no 
thermal broadening, the theory predictions are exact and show perfect agreement with the 
PBS method. Note however that the introduction of thermal broadening has a devastating 
effect on the accuracy of this prediction, making it underestimate the bias by up to 30%. 
However, we note that this difference could actually turn out to be a sensitive probe of the 
thermal broadening and thus the temperature of the IGM, but we defer this study for future 
work. Again we note curious discrepancies between the power spectrum bias predictions and 
the PBS, which are occasionally larger than what one would expect based on errors alone. 
Our tests indicate that this is unlikely due to underestimation of errors, but can be due to 
missing rj bias parameters and simply small box effects. 

5 Dependence of A and a 

Figure 6 shows the dependence on the values of A and a for a smoothing scale of R = 0.2 
Mpc/h for z = 2.5. The agreement of the second-order analytical prediction of [26] with the 
peak-background split method in real-space is to within 10% at this central redshift for a 
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Figure 6. Flux bias parameter for z = 2.5 and R = 0.2 Mpc/h, while varying the parameter A 
at constant a = 1.6 (left panel) and varying a at constant A = 0.34 (right panel) for IvRSD. Black 
represents real-space, red redshift space, and blue introducing redshift space and thermal smoothing. 
As before, solid lines are analytical formula predictions, solid points are PBS results and triangles 
mode fitting. Dashed lines show analytic predictions with v-i = 0. 


wide range of values for A and a. Again, a first-order perturbation theory with z ^2 = 0 fails 
miserably. 

However, after the redshift-space distortions are introduced, both methods seem to fail 
at about the same level, regardless of values of A and a and only weakly capture the trends. 

Again, we note an interesting discrepancy between PBS bias determination and the 
power spectrum based one, increasing towards large values of A. These correspond to more 
aggressively picking particular regions of the space where finite volume effects play a more 
important role. 

6 Hydrodynamical fields 

We now turn to the full hydrodynamic part of the simulation, taking the flux field generated 
directly from the simulation. We rescale the flux field to match the mean values of [30]. The 
analytical and likelihood bias methods are calculated directly from these fields, where the 
values of A and a are taken from a smallest chi-squared fit of the real-space r field with 
respect to the underlying density field. 

However, for the peak-background split method, the situation is a bit more complex. 
We note that while the density field in the perturbed simulations evolve as expected with 
respect to the fiducial simulation, the ionizing background is evolved with the redshift of that 
simulation, not corresponding to the time of the fiducial simulation. We therefore have to 
rescale the fitted value of A to fit that of the fiducial simulation at the same corresponding 
global time. This is further complicated by the fact that we do not have outputs of these 
perturbed simulations that would correspond to the exact same global time as those of the 
fiducial simulation (we do not have outputs at the values of z' as stated in Section 3.1). 
In order to circumvent this, we first interpolate (with a polynomial fit) the rescaling of 
A in the fiducial simulation with respect to redshift, we then calculate the redshift of the 
fiducial simulation that would represent the same global time as the output of the perturbed 
simulation (for example, z' = 2.5 corresponds to a z = 2.53 in the fiducial), and rescale the 
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Figure 7. Flux density bias (left) and velocity gradient bias (right) parameters for the hydrodynamic 
part of the simulation plotted in green with redshift, where the flux is scaled to match the means as 
measured by [30]. As a comparison, the result of the FGPA + RSD + thermal smoothing method 
of the previous sections is plotted in blue for the corresponding A and a as fitted in the simulation, 
and appropriately scaled to the mean with a smoothing of i? = 0.04 Mpc/h. The corresponding 
symbols have the usual meaning, where triangles are the likelihood method calculation, circles the 
peak-background split result, while the solid lines are the analytical predictions. 


optical depth at that output according to the rescaling function of A (in the example, this 
would correspond to rescale the perturbed simulation to match A(z=2.53)). After rescaling 
the optical depths to the appropriate values of A at the output redshifts, we calculate the 
mean values of flux, and then again use an interpolating function for this mean flux as a 
function of redshift, to calculate the mean flux values at the appropriate values of z' according 
to Equation A4. This method introduces an error due to the interpolating functions, but 
could be rectified in the future by generating outputs of the perturbed simulations at the 
appropriate values of z'. 

We compare this to the result of the FGPA -|- RSD (nlvRSD)-l- thermal broadening 
method of the previous sections, with a smoothing value of R = 0.04 Mpc/h, which corre¬ 
sponds to the value where the analytical predictions of the density bias in real-space flux fields 
were the closest to those of the real-space hydrodynamic flux predictions. The comparison is 
plotted in Figure 7. 

It is hard to read too much into this figure, since the uncertainty in the flux determi¬ 
nations is significant. We chose a value of smoothing so that predictions based on the flux 
PDF are in closest agreement and these seem to be good to 10%. The numerical value of the 
smoothing is in the right ball-park but it is not the same as the filtering scale of [14] . We find 
decent agreement between FGPA approximated fields and hydrodynamical fields, especially 
at higher redshift. 

We find a curious upturn in bs with redshift, both in the PBS determination of bias and 
the analytical predictions. This is likely the result of errors in our bias determination and a 
failure of analytical predictions at high values of A, since linear theory predicts a power-law 
change of bias with redshift [26]. Observations of the bias trend are consistent with our 
results for 2 < 2 < 2.5 [31]. 

The above values calculated with the FGPA method also do not include scatter around 
the t — 6 relation. We investigate the effect of scatter by incorporating a scatter term into the 
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relation, where r = rFGPA(l + e); following [26], where e is taken from a normal distribution 
with a standard deviation of = 0.15 as measured from the hydrodynamic part of our 
simulations. This amount of scatter results in the FGPA analytical predictions decreasing in 
absolute value by less than 1% for both bs, and bfj at the central redshift of 2 ; = 2.5, and is 
negligible for the numerical bias results. The addition of scatter therefore is not an important 
effect in the comparison above. 

For the velocity bias results, we are not able to generate the peak-background split 
method result, but we present the likelihood method and the analytical prediction, again 
comparing to the FGPA -|- nlvRSD -|- thermal broadening methods in Figure 7. The hydro- 
dynamic and FPGA fields are in good agreement with each other, but while the analytical 
formula for both nicely predicts the relative change in brj as a function of redshift, it fails 
to predict its absolute value. For both fields the absolute value of brj is within 27% of the 
mode-by-mode fit for z = 2.0, growing to a slightly worse agreement at 2 : = 3.0, where the 
absolute value of the analytical prediction is within 37% of the simulation mode-by-mode fit. 

7 Discussion &: Conclusions 

We have applied the FGPA framework in order to test the analytical form for the flux density 
bias and the velocity gradient bias against the numerical methods of the peak background 
split numerical derivative and the mode-by-mode likelihood fit. We have found that the 
short scale density perturbations evolve as predicted by the second-order perturbation theory 
assumed by [26] in the expected regime (5 < 1). This is also reflected in the optical depth 
fluctuations bias, which agrees with the numerical methods in the same regime. 

When we turn to the flux bias, however, the analytical method seems to work surpris¬ 
ingly better than for the optical depth bias, agreeing even in the smallest smoothing scales 
with the numerical methods. This may be a reflection of the fact that, unlike the optical 
depth, flux becomes more sensitive to the accurate description of less dense regions, where in 
the real space FGPA framework, F = exp(—A(1 -|- <5)“), the higher densities are suppressed. 
Indeed, the analytical prediction of the flux bias in Equation 2.11, as discussed in [26], is very 
sensitive to void regions. These less dense, void regions, are exactly where linear and second 
order perturbation theory correctly describe the mapping between densities in a typical re¬ 
gion and an equivalent matter distribution riding a large-scale over or under-density. Thus 
the analytical form of the bias based on these assumptions will hold especially well for the 
flux field. An argument supporting this idea is that Taylor-expanding the optical depth field 
to a finite order does not reproduce the bias to the same accuracy and hence it is not the 
predictions of the 2- or 3-point functions that make the 2nd order perturbation theory work 
but instead the sucess in describing the matter density PDF change in the relevant regime. 

When we introduce redshift space distortions, the predictions for the density bias pa¬ 
rameter becomes considerably less accurate. By attempting increasingly realistic ways of 
introducing the redshift-space distortions we note that this is associated with non-linear 
mapping between real and redshift-space even for linear velocities. Adding thermal broad¬ 
ening further degrades any agreement. 

For the velocity gradient bias we show that the expression given in [26] is exact in the 
limit of no thermal broadening. Our numerical results support this conclusion, effectively by 
construction. However, when thermal broadening is taken into account the predictions can be 
off considerably amounts. This indicates that an appropriate combination of measurements 
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of the small scale flux PDF and the large scale bias parameters might provide a path towards 
robust measurements of the thermal state of the IGM. 

Finally, we measure bias parameters from simulations using two different methods: 
a peak-background split method and direct mode-by-mode cross power spectrum method. 
While we find that they in general agree very well, we have found evidence that our 40 
Mpc/h boxes still suffer from finite box effects and that “one loop” bias parameters and noise 
contributions might be important. It is crucial that a physics-based parameterization of the 
Lyman-a forest bias parameters is developed, especially to allow for a robust measurement 
from finite-sized boxes. We find that bigger A values might put more emphasis on the more 
extreme/void regions of the density box, skewing the bias measurements towards these regions 
which are not well represented by the size of the box. At the same time, the fundamental k 
mode for the mode-by-mode fit is on the verge of being non-linear. Therefore further studies 
will need to include bigger box size simulations to fully understand this discrepancy. 

We also note that precise measurements using multiple boxes in the PBS formalism is 
difficult to do precisely: even if one has outputs at the precise redshifts (something we do not 
have), there are errors associated with external non-gravitational inputs such as reionization 
which is parameterized by redshift and hence occurs at slightly wrong times in offset boxes. 
For the future, it is therefore probably easier to inject a single mode with large amplitude 
into a single box and measure bias values off that particular mode. 

We conclude that while analytical methods can provide good insight into the inner work¬ 
ings of the biasing of the Lyman-aforest, they ultimately fail to provide sufficient accuracy 
for precision cosmology. This is not due to their description of non-linear density field, since 
the match is excellent for real-space flux, but rather the messiness of small-scale velocity 
effects. 
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A Parameters of peak-background perturbed simulations 

Here we derive the parameters needed to evolve an over- or under-dense simulation, rep¬ 
resenting over- or under-dense regions of the fiducial cosmology. The main requirement is 
such that p'{t) = p{t){l -b where 6i represents the matter overdensity with respect to 

the background universe (which can be thought of as the long wavelength mode overdensity 
in this region of the universe). This <5; evolves linearly with the growth factor such that 
6i{t) = D{t)6o and d6i{t)/dt = H{f)f{t)5i{t), with D{f) is the linear growth factor, 5o is the 
value of 6i at z = 0, and f{t) is the derivative of the logarithmic growth factor. 

Since p{a) oc a~^ and dlnp/dt = dlnp/dad = —3H{t), one can derive the bubble param¬ 
eter for the perturbed simulation to first order in the long overdense mode: 

^ (Inp(i) + In (1 -b di{t))) 

-3H\t) ^ -3H{t) + Hit)f{t)5iit) 

(A.l) 

Using this result we can derive the value of Ulm for the perturbed simulation, again to 
first order in the perturbed wavelength: 


p'{t) = = H\t)nm (1 + m) 


U' Ki Q 


1 + ( 1 + ) ^l{t) 


(A.2) 


In the same way, we derive the value of the perturbed Ul\, where we assume the dark 
energy density does not change: 


PaW = PA{t) = 




1 + y{t)di{t) 




(A.3) 
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Since the perturbed simulations will now have different cosmologies, the evolution of 
time with redshift is different as well. In order to use these over or underdense simulations at 
the same time (not redshift) as the fiducial simulations, we derive at what redshift outputs 
the perturbed simulations will have the same time as the unperturbed simulations. Using 
the relation p{t) = pq{1 + zY where po is the density at z = 0 of the hducial simulation, we 
require that all times the ratio = 1 + 5;(t) = is satisfied. Rearranging this ratio 

we then arrive at: 


giving us: 


(1 + 


(1 + ^) 
(1 + ^) 


+ «.(*)) 

Po 

3 1 + 6{t) 
l + 5o ’ 


1 + z' 


a+z) 




(A.4) 


Lastly, since the density changes, so will the value of as- Since this value also scales 
with the growth factor, we can express it in terms of the value at z = 0: 


asiz) = asD{z)/D{0). 


Then we can use the fact that the ratio of this value of the unperturbed to the perturbed 
simulation approaches 1 at inhnite redshift 


asD{z)/D{<d) 
a^h a'^D'{z')/D'{t)) 


lim 


= 1 , 


(A.5) 


thus arriving at: 


, _ D'{t)) D{z) 

fs — O'S" 


L»(0) D'{z') 
g'{z = 0) 1 + z g{z 

— - l\z=0 


1 


0-8 


g{z = 0) 1 + z' 
g'jz = 0)1 + z' 
g{z = 0) 1 + z 


g'{z') 1 + z 


0-8 


0-8 


1 


1 + 3 


g'jz = 0) 
g(z = 0) 
g'{z = 0) 1/ 

■ s'*" 


oo) - (5o) 


(A.6) 


where we used the fact that D{a) = g{a)a and g{a) —)• 1 as a —)• 0. Here we also pick 
our equality point such that z = z' = 0 at the same time to- To derive the parameters 
in Table 1, we read off these derived perturbed parameters at z = z' = 0, where we pick 
5i{z = 2.5) = 0.015, which is then scaled to arrive at the corresponding value of 5o at z = 0 
for the formulae above. 
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